CMICH-Zhang-MC1

VAST 2012 Challenge
Mini-Challenge 1: Bank of Money Enterprise: Cyber Situation Awareness

 

 

Team Members:

 

Tao Zhang, Central Michigan University, zhang3t@cmich.edu     PRIMARY
Qi Liao, Central Michigan University, qi.liao@cmich.edu

Lei Shi, Institute of Software, Chinese Academy of Sciences, China, shijim@gmail.com

Student Team:   YES

 

Tool(s):

 

Google Earth, developed by Google Inc., http://www.google.com/earth/index.html

Python, http://www.python.org/

3DBarModelGenerator4Kml, developed by Zhang, CMICH

 

Video:

 

CMICH-Zhang-MC1.wmv

 

 

Answers to Mini-Challenge 1 Questions:

 

MC 1.1  Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe? 

 

The following three snapshots (Figures 1.1-1.3) show the distribution of all machines' policy and activity status, aggregated by region: the 50 regional offices (40 small, 10 large) plus the headquarters. The yellow 3D bars represent the sum of all machines' policy status in each region, encoded as bar height (altitude). Each machine's policy status is reduced by one before summing, so a policy status of 1 contributes zero bar height and larger policy values contribute positive height. The red 3D bars represent the sum of all machines' activity flags in each region. An activity flag of 1 (normal) contributes 0, and any activity flag larger than 1 contributes 1 to the summed activity.
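
The two bar heights described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of our generator: the function name `region_bars`, the tuple layout of the records, and the sample values are assumptions made for the example.

```python
from collections import defaultdict

def region_bars(records):
    """Return {region: (policy_height, activity_height)} for one snapshot.

    `records` is an iterable of (region, policy_status, activity_flag)
    tuples, one per machine. This layout is hypothetical.
    """
    bars = defaultdict(lambda: [0, 0])
    for region, policy_status, activity_flag in records:
        # Policy status 1 (normal) adds nothing; 2..5 add 1..4.
        bars[region][0] += policy_status - 1
        # Any activity flag above 1 counts as one abnormal machine.
        bars[region][1] += 1 if activity_flag > 1 else 0
    return {region: tuple(heights) for region, heights in bars.items()}

# Made-up sample values, for illustration only.
snapshot = [
    ("region-10", 1, 1),
    ("region-10", 3, 2),
    ("region-5", 2, 1),
    ("headquarters", 5, 4),
]
bars = region_bars(snapshot)
```

With the sample above, the normal machine in region 10 adds nothing to either bar, while the second machine adds 2 to the yellow (policy) bar and 1 to the red (activity) bar.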

 

This visualization gives an overview at a single point in time by comparing each region's degree of abnormality as the heights of its 3D bars. Three areas stand out as concerns. The first is Hatt, region 10, which has the highest policy status bar. The second is Lomu, region 5, chosen for the same reason: it has the second-highest policy status bar in BankWorld. The last is Zizzer, the headquarters region. The policy status bars of these three regions are markedly higher than those of the other regions. The headquarters' activity flag bar is also markedly higher, which is worth investigating.

 

Figure 1.1: Region 10, Hatt.

 

Figure 1.2: Region 5, Lomu.

 

Figure 1.3: Headquarters, Zizzer.

 

MC 1.2  Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?

 

Again, our visualization uses the region as the unit of aggregation. Using BMT time as another dimension, we obtain a time series of status distributions, one every 15 minutes. This time, however, we show only one bar per region, corresponding to the connection, activity, or policy sum. The three attributes (numConnections, policyStatus, activityFlag) are calculated as follows. For numConnections, we use the mean: the summed value divided by the number of machines in the region. The policyStatus attribute has 5 values that indicate how serious a machine's policy deviation is, from 1 (normal) to 5 (very dangerous). To emphasize the abnormalities, 1 is subtracted from each policyStatus value before summing, so that a normal machine's policy status (1) is not counted. The activityFlag attribute has 5 values. Value 1 means the machine is working normally; the values greater than 1 indicate different abnormal activities on a machine. All values greater than 1 are worth investigating, so a value of 1 is counted as 0 and every other value as 1. After this calculation, the summed value tells us how many abnormal machines are in the region. An animated (movie) visualization is used to connect the dots along the timeline.
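
The per-slot calculation above could look roughly like the following Python sketch. The attribute names numConnections, policyStatus, and activityFlag come from the data. The function names, the dict-based row layout, and the assumption that each machine reports once per 15-minute slot (so that the number of rows equals the number of machines) are ours for illustration only.

```python
from collections import defaultdict
from datetime import datetime

def floor_to_quarter_hour(ts):
    """Round a datetime down to the start of its 15-minute slot."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)

def region_time_series(rows):
    """Aggregate machine reports into one record per (time slot, region).

    Each row is a dict with the hypothetical keys "time" (datetime) and
    "region", plus numConnections, policyStatus and activityFlag.
    Assumes one report per machine per slot.
    """
    acc = defaultdict(lambda: {"machines": 0, "connections": 0,
                               "policy": 0, "abnormal": 0})
    for row in rows:
        cell = acc[(floor_to_quarter_hour(row["time"]), row["region"])]
        cell["machines"] += 1
        cell["connections"] += row["numConnections"]
        cell["policy"] += row["policyStatus"] - 1          # normal (1) adds 0
        cell["abnormal"] += 1 if row["activityFlag"] > 1 else 0
    return {
        key: {
            "mean_connections": c["connections"] / c["machines"],
            "policy_sum": c["policy"],
            "abnormal_machines": c["abnormal"],
        }
        for key, c in acc.items()
    }

# Made-up sample rows, for illustration only.
rows = [
    {"time": datetime(2012, 2, 2, 14, 0), "region": "region-5",
     "numConnections": 10, "policyStatus": 1, "activityFlag": 1},
    {"time": datetime(2012, 2, 2, 14, 7), "region": "region-5",
     "numConnections": 20, "policyStatus": 3, "activityFlag": 2},
    {"time": datetime(2012, 2, 2, 14, 15), "region": "region-5",
     "numConnections": 8, "policyStatus": 2, "activityFlag": 1},
]
series = region_time_series(rows)
```

Each value in `series` then holds the three candidate bar heights for one region in one time slot; only one of them is drawn at a time.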

 

After data preparation, we use a KML file generator written in Python to collect each region's information (location, attributes, time, and so on) and generate KML files. These KML files can be viewed in GIS software such as Google Earth.
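
As a rough idea of what such a generator emits, the sketch below builds one time-stamped 3D bar as an extruded KML polygon using Python's standard library. It is not the code of 3DBarModelGenerator4Kml: the function names, the bar footprint, the coordinates, and the height in meters are placeholders chosen for the example. The KML elements themselves (Placemark, TimeSpan, PolyStyle, Polygon with extrude) are standard KML 2.2.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def bar_placemark(name, lon, lat, height_m, begin, end,
                  color="ff00ffff", half_width_deg=0.05):
    """Build one 3D bar as a Placemark holding an extruded square polygon.

    `color` uses KML's aabbggrr order: "ff00ffff" is opaque yellow,
    "ff0000ff" is opaque red. The footprint size is an arbitrary choice.
    """
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    span = ET.SubElement(pm, "TimeSpan")      # when the bar is visible
    ET.SubElement(span, "begin").text = begin
    ET.SubElement(span, "end").text = end
    poly_style = ET.SubElement(ET.SubElement(pm, "Style"), "PolyStyle")
    ET.SubElement(poly_style, "color").text = color
    polygon = ET.SubElement(pm, "Polygon")
    ET.SubElement(polygon, "extrude").text = "1"   # walls down to the ground
    ET.SubElement(polygon, "altitudeMode").text = "relativeToGround"
    ring = ET.SubElement(ET.SubElement(polygon, "outerBoundaryIs"), "LinearRing")
    w = half_width_deg
    corners = [(lon - w, lat - w), (lon + w, lat - w), (lon + w, lat + w),
               (lon - w, lat + w), (lon - w, lat - w)]   # closed ring
    ET.SubElement(ring, "coordinates").text = " ".join(
        f"{x:.4f},{y:.4f},{height_m:.1f}" for x, y in corners)
    return pm

def build_kml(placemarks):
    """Wrap the placemarks in a KML document and return it as a string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    ET.SubElement(kml, "Document").extend(placemarks)
    return ET.tostring(kml, encoding="unicode")

# Placeholder location and height, for illustration only.
kml_text = build_kml([
    bar_placemark("region-10 policy", 10.0, 20.0, 5000.0,
                  "2012-02-02T14:00:00", "2012-02-02T14:15:00"),
])
```

Writing `kml_text` to a `.kml` file and opening it in Google Earth should show the bar only while the time slider is inside the given span, which is what makes the animation over 15-minute slots possible.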

 

We group by region rather than by a finer unit (e.g. branch) because grouping by branch would generate more than 4000 3D bars, which would be hard to display and hard for a viewer to tell apart. Grouping by branch would, however, be an appropriate way to show more detail when studying a single region.

 

 

Anomaly #1: anomalous activity flag changes

Anomaly location: the large regional offices (regions 1-10 and headquarters)

Anomaly time: 11am - 3am (next day), every day

Anomaly degree: during this time, the activityFlag value of each large regional office rises well above those of the small regional offices around it. This elevated activityFlag level lasts until 3am each day.

Anomaly explanation: In our visualization, the height of the activityFlag bar represents the number of abnormal machines. The high frequency of abnormal situations might be due to heavier usage of the machines in the large regional offices. The issues might last until 3am each day because maintenance is applied on a daily schedule.

 

Figure 2.1. Activity flag at 11am

 

Figure 2.2. Activity flag at 5:22pm

 

Figure 2.3. Activity flag at 3am

 

Anomaly #2: anomalous policy status changes

Anomaly location: large regional offices (regions 1-10 and headquarters)

Anomaly time: rises continuously over time

Anomaly degree: region 5 and region 10 start with higher policy status bars. Over the period covered by the log data, the policyStatus value of each large regional office rises well above those of the small regional offices around it. Almost all the bars keep rising until the end of the period. The headquarters' policyStatus bar is the highest one at the end of the period.

Anomaly explanation:

In our visualization, the height of the policy status bar represents how severe the policy issues of a region's machines are. A higher value might indicate that the region is being attacked. The rising pattern of the policy status can be a critical issue for BoM, especially at the headquarters. It seems that most policy deviation warnings still exist at the end, while the sum of policyStatus keeps increasing. But this is only one hypothesis. Another possibility is that the policyStatus increase happens in all large regional offices because large regional offices are common targets for various intrusion attempts.

Figure 2.4. Policy status at the beginning of the period

 

Figure 2.5. Policy status keeps rising in the middle of the period

 

Figure 2.6. Headquarters with the highest policy status at the end of the period